Overview

Dataset statistics

Number of variables: 50
Number of observations: 864
Missing cells: 5714
Missing cells (%): 13.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 337.6 KiB
Average record size in memory: 400.1 B
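
The percentage and per-record figures in this block follow directly from the raw counts. As a quick sanity check, the arithmetic can be reproduced in plain Python. This is only a sketch, not pandas-profiling's own code; the variable names are illustrative, and 1 KiB is assumed to be 1024 bytes.

```python
# Raw counts copied from the dataset statistics.
n_variables = 50
n_observations = 864
missing_cells = 5714
total_kib = 337.6  # total size in memory, in KiB

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

# Share of cells that are missing, as a percentage.
missing_pct = 100 * missing_cells / total_cells

# Average bytes per record (assumes 1 KiB = 1024 B).
avg_record_bytes = total_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells (%): {missing_pct:.1f}")
print(f"average record size (B): {avg_record_bytes:.1f}")
```

Both derived values round to the figures reported here (13.2% and 400.1 B).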

Variable types

NUM: 25
BOOL: 14
CAT: 11

Warnings

학부_편입여부 is highly correlated with 학생통계번호 (High correlation)
학생통계번호 is highly correlated with 학부_편입여부 (High correlation)
11학기성적 is highly correlated with 토익점수 and 5 other fields (High correlation)
토익점수 is highly correlated with 11학기성적 (High correlation)
7학기성적 is highly correlated with 11학기성적 (High correlation)
직전2학기평균 is highly correlated with 11학기성적 (High correlation)
성적오름추세 is highly correlated with 11학기성적 (High correlation)
성적기울기 is highly correlated with 11학기성적 (High correlation)
직전2학기증가율 is highly correlated with 11학기성적 (High correlation)
대학원_입학연도 is highly correlated with 대학원_졸업연도 (High correlation)
대학원_졸업연도 is highly correlated with 대학원_입학연도 (High correlation)
학부_학과 is highly correlated with 학부_단과대학 (High correlation)
학부_단과대학 is highly correlated with 학부_학과 (High correlation)
대학원_학과 is highly correlated with 대학원_계열 (High correlation)
대학원_계열 is highly correlated with 대학원_학과 (High correlation)
토익점수 has 540 (62.5%) missing values (Missing)
2학기성적 has 15 (1.7%) missing values (Missing)
3학기성적 has 87 (10.1%) missing values (Missing)
4학기성적 has 104 (12.0%) missing values (Missing)
5학기성적 has 212 (24.5%) missing values (Missing)
6학기성적 has 261 (30.2%) missing values (Missing)
7학기성적 has 396 (45.8%) missing values (Missing)
8학기성적 has 487 (56.4%) missing values (Missing)
9학기성적 has 773 (89.5%) missing values (Missing)
10학기성적 has 840 (97.2%) missing values (Missing)
11학기성적 has 860 (99.5%) missing values (Missing)
12학기성적 has 863 (99.9%) missing values (Missing)
대학원_졸업연도 has 276 (31.9%) missing values (Missing)
11학기성적 is uniformly distributed (Uniform)
Unnamed: 0 has unique values (Unique)
학생통계번호 has unique values (Unique)
장학금액 has 195 (22.6%) zeros (Zeros)
성적오름추세 has 26 (3.0%) zeros (Zeros)
성적기울기 has 25 (2.9%) zeros (Zeros)
성적기울기_직전4학기 has 23 (2.7%) zeros (Zeros)
직전2학기증가율 has 51 (5.9%) zeros (Zeros)
km_5 has 172 (19.9%) zeros (Zeros)
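
The missing-value warnings can be turned into a simple screening step. The sketch below recomputes each missing percentage from the counts listed in the warnings and flags columns above a cutoff. The 90% cutoff and the name `MISSING_CUTOFF` are assumptions chosen for illustration; nothing in the report prescribes a threshold.

```python
# Missing-value counts copied from the warnings; 864 observations in total.
N_OBSERVATIONS = 864
missing_counts = {
    "토익점수": 540,
    "2학기성적": 15,
    "3학기성적": 87,
    "4학기성적": 104,
    "5학기성적": 212,
    "6학기성적": 261,
    "7학기성적": 396,
    "8학기성적": 487,
    "9학기성적": 773,
    "10학기성적": 840,
    "11학기성적": 860,
    "12학기성적": 863,
    "대학원_졸업연도": 276,
}

# Hypothetical cutoff: treat columns above 90% missing as nearly empty.
MISSING_CUTOFF = 0.90

# Fraction of missing values per column.
missing_fraction = {
    name: count / N_OBSERVATIONS for name, count in missing_counts.items()
}

# Columns that exceed the cutoff, in the order listed above.
nearly_empty = [
    name for name, frac in missing_fraction.items() if frac > MISSING_CUTOFF
]

for name, frac in missing_fraction.items():
    print(f"{name}: {100 * frac:.1f}% missing")
print("nearly empty:", nearly_empty)
```

With this cutoff only the 10th to 12th semester grade columns are flagged, which also bears on the correlation warnings for 11학기성적: that column has only 4 non-missing values, so correlations involving it rest on very few observations.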

Reproduction

Analysis started: 2021-01-15 04:01:13.271946
Analysis finished: 2021-01-15 04:02:14.709719
Duration: 1 minute and 1.44 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

Unnamed: 0
Real number (ℝ≥0)

UNIQUE

Distinct: 864
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 431.5
Minimum: 0
Maximum: 863
Zeros: 1
Zeros (%): 0.1%
Memory size: 6.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 43.15
Q1: 215.75
Median: 431.5
Q3: 647.25
95-th percentile: 819.85
Maximum: 863
Range: 863
Interquartile range (IQR): 431.5

Descriptive statistics

Standard deviation: 249.5596121
Coefficient of variation (CV): 0.5783536781
Kurtosis: -1.2
Mean: 431.5
Median Absolute Deviation (MAD): 216
Skewness: 0
Sum: 372816
Variance: 62280
Monotonicity: Strictly increasing
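
The statistics for this column are consistent with it being a plain row index 0, 1, ..., 863 (864 distinct values, minimum 0, maximum 863, strictly increasing). Under that assumption, which the report itself does not state, the figures can be reproduced with Python's standard library. The sketch uses the sample variance (divisor n - 1) and linearly interpolated quantiles, which is what matches the reported numbers.

```python
import statistics

# Assumption: the column is simply the row index 0..863.
values = list(range(864))

mean = statistics.fmean(values)
variance = statistics.variance(values)  # sample variance, divisor n - 1
std = statistics.stdev(values)          # square root of the sample variance
cv = std / mean                         # coefficient of variation

# Linearly interpolated quartiles over the closed range of the data.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# 5th percentile: first cut point when splitting into 20 equal parts.
p5 = statistics.quantiles(values, n=20, method="inclusive")[0]

# Median absolute deviation around the median.
mad = statistics.median([abs(v - median) for v in values])

print(mean, variance, round(std, 7), round(cv, 10))
print(p5, q1, median, q3, iqr, mad)
```

Note that the coefficient of variation is just the standard deviation divided by the mean, so it becomes very large or changes sign for columns whose mean is close to zero.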
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
863                 1      0.1%
283                 1      0.1%
294                 1      0.1%
293                 1      0.1%
292                 1      0.1%
291                 1      0.1%
290                 1      0.1%
289                 1      0.1%
288                 1      0.1%
287                 1      0.1%
Other values (854)  854    98.8%

Value  Count  Frequency (%)
0      1      0.1%
1      1      0.1%
2      1      0.1%
3      1      0.1%
4      1      0.1%

Value  Count  Frequency (%)
863    1      0.1%
862    1      0.1%
861    1      0.1%
860    1      0.1%
859    1      0.1%

학생통계번호
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct864
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean7667767085
Minimum37114635
Maximum8878014647
Zeros0
Zeros (%)0.0%
Memory size6.8 KiB

Quantile statistics

Minimum37114635
5-th percentile442005648.4
Q18126944634
median8217054645
Q38447549646
95-th percentile8848422648
Maximum8878014647
Range8840900012
Interquartile range (IQR)320605011.8

Descriptive statistics

Standard deviation2211241791
Coefficient of variation (CV)0.2883814501
Kurtosis7.029547139
Mean7667767085
Median Absolute Deviation (MAD)129594998.5
Skewness-2.970270847
Sum6.624950761e+12
Variance4.889590257e+18
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
844734463710.1%
 
802696464110.1%
 
844714463710.1%
 
855722463610.1%
 
802569464910.1%
 
844825464510.1%
 
844697464110.1%
 
803737464010.1%
 
844825463810.1%
 
884872465010.1%
 
Other values (854)85498.8%
 
ValueCountFrequency (%) 
3711463510.1%
 
3727464010.1%
 
3731463710.1%
 
4218464910.1%
 
8692464110.1%
 
ValueCountFrequency (%) 
887801464710.1%
 
887799465210.1%
 
887752465410.1%
 
887745465610.1%
 
887713464910.1%
 

성별
Categorical

Distinct2
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size6.8 KiB
남자
617 
여자
247 
ValueCountFrequency (%) 
남자61771.4%
 
여자24728.6%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length2
Median length2
Mean length2
Min length2

학부_단과대학
Categorical

HIGH CORRELATION

Distinct7
Distinct (%)0.8%
Missing0
Missing (%)0.0%
Memory size6.8 KiB
공과대학
415 
에너지바이오대학
181 
정보통신대학
126 
조형대학
80 
기술경영융합대학
58 
Other values (2)
 
4
ValueCountFrequency (%) 
공과대학41548.0%
 
에너지바이오대학18120.9%
 
정보통신대학12614.6%
 
조형대학809.3%
 
기술경영융합대학586.7%
 
인문사회대학20.2%
 
미래융합대학20.2%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length8
Median length4
Mean length5.407407407
Min length4

학부_학과
Categorical

HIGH CORRELATION

Distinct41
Distinct (%)4.7%
Missing0
Missing (%)0.0%
Memory size6.8 KiB
기계시스템디자인공학과
118 
기계·자동차공학과
85 
정밀화학과
65 
신소재공학과
61 
전기정보공학과
59 
Other values (36)
476 
ValueCountFrequency (%) 
기계시스템디자인공학과11813.7%
 
기계·자동차공학과859.8%
 
정밀화학과657.5%
 
신소재공학과617.1%
 
전기정보공학과596.8%
 
건설시스템공학과536.1%
 
전자IT미디어공학과455.2%
 
환경공학전공435.0%
 
식품공학과333.8%
 
건축공학전공313.6%
 
Other values (31)27131.4%
 
Frequencies of value counts

Unique

Unique: 7
Unique (%): 0.8%
Histogram of lengths of the category

Length

Max length13
Median length7
Mean length7.631944444
Min length4
Distinct2
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size6.8 KiB
주간
818 
야간
 
46
ValueCountFrequency (%) 
주간81894.7%
 
야간465.3%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length2
Median length2
Mean length2
Min length2

학부_편입여부
Boolean

HIGH CORRELATION

Distinct2
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size6.8 KiB
0
782 
1
82 
ValueCountFrequency (%) 
078290.5%
 
1829.5%
 
Distinct5
Distinct (%)0.6%
Missing0
Missing (%)0.0%
Memory size6.8 KiB
신입학
779 
3학년편입학
 
76
3학년편입위탁
 
6
신입재입학
 
2
신입위탁
 
1
ValueCountFrequency (%) 
신입학77990.2%
 
3학년편입학768.8%
 
3학년편입위탁60.7%
 
신입재입학20.2%
 
신입위탁10.1%
 
Frequencies of value counts

Unique

Unique: 1
Unique (%): 0.1%
Histogram of lengths of the category

Length

Max length7
Median length3
Mean length3.297453704
Min length3

학부_입학년도
Real number (ℝ≥0)

Distinct16
Distinct (%)1.9%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2010.967593
Minimum2003
Maximum2018
Zeros0
Zeros (%)0.0%
Memory size6.8 KiB

Quantile statistics

Minimum2003
5-th percentile2006
Q12009
median2011
Q32013
95-th percentile2016
Maximum2018
Range15
Interquartile range (IQR)4

Descriptive statistics

Standard deviation2.963097208
Coefficient of variation (CV)0.001473468404
Kurtosis-0.4783011779
Mean2010.967593
Median Absolute Deviation (MAD)2
Skewness-0.09023824754
Sum1737476
Variance8.779945067
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=16)
ValueCountFrequency (%) 
201012414.4%
 
200910111.7%
 
20119911.5%
 
20139811.3%
 
2012859.8%
 
2014778.9%
 
2008657.5%
 
2015596.8%
 
2007374.3%
 
2006343.9%
 
Other values (6)859.8%
 
ValueCountFrequency (%) 
200320.2%
 
200470.8%
 
2005252.9%
 
2006343.9%
 
2007374.3%
 
ValueCountFrequency (%) 
201850.6%
 
2017131.5%
 
2016333.8%
 
2015596.8%
 
2014778.9%
 
Distinct18
Distinct (%)2.1%
Missing0
Missing (%)0.0%
Memory size6.8 KiB
일반전형
538 
수시2-일반전형
135 
수시전형
 
48
특별전형
 
36
농어촌전형
 
31
Other values (13)
76 
ValueCountFrequency (%) 
일반전형53862.3%
 
수시2-일반전형13515.6%
 
수시전형485.6%
 
특별전형364.2%
 
농어촌전형313.6%
 
수시-전공적성우수자192.2%
 
계약학과전형91.0%
 
일반편입91.0%
 
수능우수자특별전형80.9%
 
차세대지도자특별전형70.8%
 
Other values (8)242.8%
 
Frequencies of value counts

Unique

Unique: 2
Unique (%): 0.2%
Histogram of lengths of the category

Length

Max length12
Median length4
Mean length5.020833333
Min length2
Distinct6
Distinct (%)0.7%
Missing0
Missing (%)0.0%
Memory size6.8 KiB
정원내
811 
농어촌
 
31
위탁생
 
10
전문계고교특별
 
6
정원외전체
 
3
ValueCountFrequency (%) 
정원내81193.9%
 
농어촌313.6%
 
위탁생101.2%
 
전문계고교특별60.7%
 
정원외전체30.3%
 
재외국민30.3%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length7
Median length3
Mean length3.038194444
Min length3
Distinct2
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size6.8 KiB
0
657 
1
207 
ValueCountFrequency (%) 
065776.0%
 
120724.0%
 
Distinct2
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size6.8 KiB
0
842 
1
 
22
ValueCountFrequency (%) 
084297.5%
 
1222.5%
 
Distinct2
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size6.8 KiB
0
830 
1
 
34
ValueCountFrequency (%) 
083096.1%
 
1343.9%
 
Distinct2
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size6.8 KiB
0
812 
1
 
52
ValueCountFrequency (%) 
081294.0%
 
1526.0%
 

토익점수
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct106
Distinct (%)32.7%
Missing540
Missing (%)62.5%
Infinite0
Infinite (%)0.0%
Mean701.8055556
Minimum195
Maximum955
Zeros0
Zeros (%)0.0%
Memory size6.8 KiB

Quantile statistics

Minimum195
5-th percentile455.75
Q1610
median710
Q3811.25
95-th percentile905
Maximum955
Range760
Interquartile range (IQR)201.25

Descriptive statistics

Standard deviation142.2357824
Coefficient of variation (CV)0.2026712118
Kurtosis-0.2339557858
Mean701.8055556
Median Absolute Deviation (MAD)100
Skewness-0.4363018254
Sum227385
Variance20231.0178
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
80580.9%
 
64070.8%
 
62070.8%
 
68570.8%
 
69570.8%
 
78560.7%
 
75560.7%
 
75060.7%
 
81560.7%
 
90550.6%
 
Other values (96)25930.0%
 
(Missing)54062.5%
 
ValueCountFrequency (%) 
19510.1%
 
30010.1%
 
34010.1%
 
35510.1%
 
37010.1%
 
ValueCountFrequency (%) 
95510.1%
 
95010.1%
 
94520.2%
 
94020.2%
 
93020.2%
 
Distinct2
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size6.8 KiB
1
657 
0
207 
ValueCountFrequency (%) 
165776.0%
 
020724.0%
 

장학금액
Real number (ℝ≥0)

ZEROS

Distinct556
Distinct (%)64.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2041872.176
Minimum0
Maximum19738100
Zeros195
Zeros (%)22.6%
Memory size6.8 KiB

Quantile statistics

Minimum0
5-th percentile0
Q191702.5
median805315
Q32795480
95-th percentile7741714.5
Maximum19738100
Range19738100
Interquartile range (IQR)2703777.5

Descriptive statistics

Standard deviation3000288.701
Coefficient of variation (CV)1.469381255
Kurtosis8.184079598
Mean2041872.176
Median Absolute Deviation (MAD)805315
Skewness2.50876018
Sum1764177560
Variance9.001732289e+12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
019522.6%
 
97380141.6%
 
138510091.0%
 
10279080.9%
 
17021080.9%
 
9197070.8%
 
10820070.8%
 
63520070.8%
 
9999050.6%
 
20558050.6%
 
Other values (546)59969.3%
 
ValueCountFrequency (%) 
019522.6%
 
1500010.1%
 
1515010.1%
 
2773010.1%
 
4328010.1%
 
ValueCountFrequency (%) 
1973810010.1%
 
1956698010.1%
 
1942508010.1%
 
1858035010.1%
 
1848445010.1%
 
Distinct2
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size6.8 KiB
0
712 
1
152 
ValueCountFrequency (%) 
071282.4%
 
115217.6%
 
Distinct2
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size6.8 KiB
0
819 
1
 
45
ValueCountFrequency (%) 
081994.8%
 
1455.2%
 
Distinct2
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size6.8 KiB
0
746 
1
118 
ValueCountFrequency (%) 
074686.3%
 
111813.7%
 
Distinct2
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size6.8 KiB
0
840 
1
 
24
ValueCountFrequency (%) 
084097.2%
 
1242.8%
 
Distinct2
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size6.8 KiB
0
663 
1
201 
ValueCountFrequency (%) 
066376.7%
 
120123.3%
 
Distinct2
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size6.8 KiB
1
489 
0
375 
ValueCountFrequency (%) 
148956.6%
 
037543.4%
 
Distinct2
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size6.8 KiB
0
524 
1
340 
ValueCountFrequency (%) 
052460.6%
 
134039.4%
 

1학기성적
Real number (ℝ≥0)

Distinct185
Distinct (%)21.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.541099537
Minimum1.36
Maximum4.5
Zeros0
Zeros (%)0.0%
Memory size6.8 KiB

Quantile statistics

Minimum1.36
5-th percentile2.7315
Q13.25
median3.58
Q33.89
95-th percentile4.2385
Maximum4.5
Range3.14
Interquartile range (IQR)0.64

Descriptive statistics

Standard deviation0.4716216144
Coefficient of variation (CV)0.133185077
Kurtosis0.4645993431
Mean3.541099537
Median Absolute Deviation (MAD)0.32
Skewness-0.5861250374
Sum3059.51
Variance0.2224269472
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
4333.8%
 
3.5263.0%
 
3182.1%
 
3.63182.1%
 
3.53141.6%
 
3.58141.6%
 
3.71131.5%
 
3.83111.3%
 
3.25111.3%
 
3.79111.3%
 
Other values (175)69580.4%
 
ValueCountFrequency (%) 
1.3610.1%
 
220.2%
 
2.0910.1%
 
2.1110.1%
 
2.1710.1%
 
ValueCountFrequency (%) 
4.550.6%
 
4.4320.2%
 
4.4210.1%
 
4.4110.1%
 
4.3930.3%
 

2학기성적
Real number (ℝ≥0)

MISSING

Distinct192
Distinct (%)22.6%
Missing15
Missing (%)1.7%
Infinite0
Infinite (%)0.0%
Mean3.559705536
Minimum1.22
Maximum4.5
Zeros0
Zeros (%)0.0%
Memory size6.8 KiB

Quantile statistics

Minimum1.22
5-th percentile2.644
Q13.26
median3.59
Q33.9
95-th percentile4.32
Maximum4.5
Range3.28
Interquartile range (IQR)0.64

Descriptive statistics

Standard deviation0.5096764447
Coefficient of variation (CV)0.1431793837
Kurtosis0.891464199
Mean3.559705536
Median Absolute Deviation (MAD)0.32
Skewness-0.6935908388
Sum3022.19
Variance0.2597700783
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
3.5273.1%
 
4232.7%
 
3.58172.0%
 
3.55141.6%
 
3.75131.5%
 
3.44121.4%
 
3.89121.4%
 
3.03121.4%
 
3.86121.4%
 
3.29111.3%
 
Other values (182)69680.6%
 
(Missing)151.7%
 
ValueCountFrequency (%) 
1.2210.1%
 
1.520.2%
 
1.8310.1%
 
1.8620.2%
 
2.0410.1%
 
ValueCountFrequency (%) 
4.550.6%
 
4.4510.1%
 
4.4420.2%
 
4.4340.5%
 
4.4260.7%
 

3학기성적
Real number (ℝ≥0)

MISSING

Distinct172
Distinct (%)22.1%
Missing87
Missing (%)10.1%
Infinite0
Infinite (%)0.0%
Mean3.635225225
Minimum1
Maximum4.5
Zeros0
Zeros (%)0.0%
Memory size6.8 KiB

Quantile statistics

Minimum1
5-th percentile2.698
Q13.34
median3.7
Q34
95-th percentile4.33
Maximum4.5
Range3.5
Interquartile range (IQR)0.66

Descriptive statistics

Standard deviation0.4994517427
Coefficient of variation (CV)0.1373922417
Kurtosis1.462760887
Mean3.635225225
Median Absolute Deviation (MAD)0.3
Skewness-0.8556785595
Sum2824.57
Variance0.2494520433
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
4343.9%
 
3.5293.4%
 
3.83172.0%
 
3.79141.6%
 
3.25131.5%
 
4.5121.4%
 
3121.4%
 
3.53121.4%
 
4.25121.4%
 
3.92111.3%
 
Other values (162)61170.7%
 
(Missing)8710.1%
 
ValueCountFrequency (%) 
110.1%
 
1.310.1%
 
230.3%
 
2.1310.1%
 
2.2410.1%
 
ValueCountFrequency (%) 
4.5121.4%
 
4.4510.1%
 
4.4420.2%
 
4.4340.5%
 
4.4250.6%
 

4학기성적
Real number (ℝ≥0)

MISSING

Distinct176
Distinct (%)23.2%
Missing104
Missing (%)12.0%
Infinite0
Infinite (%)0.0%
Mean3.685789474
Minimum2
Maximum4.5
Zeros0
Zeros (%)0.0%
Memory size6.8 KiB

Quantile statistics

Minimum2
5-th percentile2.769
Q13.38
median3.76
Q34.06
95-th percentile4.38
Maximum4.5
Range2.5
Interquartile range (IQR)0.68

Descriptive statistics

Standard deviation0.4946324893
Coefficient of variation (CV)0.1341998757
Kurtosis0.1683223095
Mean3.685789474
Median Absolute Deviation (MAD)0.33
Skewness-0.6930527779
Sum2801.2
Variance0.2446612995
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
4364.2%
 
3.5232.7%
 
4.5192.2%
 
3.75161.9%
 
3.88151.7%
 
3.83151.7%
 
4.25141.6%
 
3131.5%
 
4.06111.3%
 
4.08111.3%
 
Other values (166)58767.9%
 
(Missing)10412.0%
 
ValueCountFrequency (%) 
220.2%
 
2.110.1%
 
2.1720.2%
 
2.2510.1%
 
2.2710.1%
 
ValueCountFrequency (%) 
4.5192.2%
 
4.4520.2%
 
4.4420.2%
 
4.4360.7%
 
4.4230.3%
 

5학기성적
Real number (ℝ≥0)

MISSING

Distinct162
Distinct (%)24.8%
Missing212
Missing (%)24.5%
Infinite0
Infinite (%)0.0%
Mean3.701058282
Minimum2.06
Maximum4.5
Zeros0
Zeros (%)0.0%
Memory size6.8 KiB

Quantile statistics

Minimum2.06
5-th percentile2.84
Q13.42
median3.76
Q34.05
95-th percentile4.38
Maximum4.5
Range2.44
Interquartile range (IQR)0.63

Descriptive statistics

Standard deviation0.467411031
Coefficient of variation (CV)0.1262911836
Kurtosis0.03224677293
Mean3.701058282
Median Absolute Deviation (MAD)0.32
Skewness-0.5800044448
Sum2413.09
Variance0.2184730719
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
3.5222.5%
 
3192.2%
 
4182.1%
 
3.83141.6%
 
3.58131.5%
 
4.08131.5%
 
3.9121.4%
 
3.88111.3%
 
4.5111.3%
 
4.17101.2%
 
Other values (152)50958.9%
 
(Missing)21224.5%
 
ValueCountFrequency (%) 
2.0610.1%
 
2.1710.1%
 
2.2910.1%
 
2.3210.1%
 
2.3310.1%
 
ValueCountFrequency (%) 
4.5111.3%
 
4.4610.1%
 
4.4510.1%
 
4.4340.5%
 
4.4250.6%
 

6학기성적
Real number (ℝ≥0)

MISSING

Distinct149
Distinct (%)24.7%
Missing261
Missing (%)30.2%
Infinite0
Infinite (%)0.0%
Mean3.731525705
Minimum1.5
Maximum4.5
Zeros0
Zeros (%)0.0%
Memory size6.8 KiB

Quantile statistics

Minimum1.5
5-th percentile2.97
Q13.445
median3.79
Q34.085
95-th percentile4.369
Maximum4.5
Range3
Interquartile range (IQR)0.64

Descriptive statistics

Standard deviation0.4606336751
Coefficient of variation (CV)0.1234437899
Kurtosis0.6740922866
Mean3.731525705
Median Absolute Deviation (MAD)0.31
Skewness-0.7043152425
Sum2250.11
Variance0.2121833826
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
4283.2%
 
3.5252.9%
 
4.17172.0%
 
4.5141.6%
 
3.83121.4%
 
3.9111.3%
 
3.71111.3%
 
3101.2%
 
3.91101.2%
 
3.6101.2%
 
Other values (139)45552.7%
 
(Missing)26130.2%
 
ValueCountFrequency (%) 
1.510.1%
 
2.2810.1%
 
2.2930.3%
 
2.540.5%
 
2.5610.1%
 
ValueCountFrequency (%) 
4.5141.6%
 
4.4520.2%
 
4.4420.2%
 
4.4260.7%
 
4.4130.3%
 

7학기성적
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct124
Distinct (%)26.5%
Missing396
Missing (%)45.8%
Infinite0
Infinite (%)0.0%
Mean3.888611111
Minimum2.06
Maximum4.5
Zeros0
Zeros (%)0.0%
Memory size6.8 KiB

Quantile statistics

Minimum2.06
5-th percentile3.1
Q13.65
median3.93
Q34.2
95-th percentile4.43
Maximum4.5
Range2.44
Interquartile range (IQR)0.55

Descriptive statistics

Standard deviation0.4123495046
Coefficient of variation (CV)0.1060403041
Kurtosis0.9173109451
Mean3.888611111
Median Absolute Deviation (MAD)0.27
Skewness-0.8910508774
Sum1819.87
Variance0.170032114
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
4.5222.5%
 
4202.3%
 
3.5192.2%
 
3.9121.4%
 
4.3111.3%
 
3.93111.3%
 
4.25111.3%
 
3.75111.3%
 
4.08101.2%
 
4.3391.0%
 
Other values (114)33238.4%
 
(Missing)39645.8%
 
ValueCountFrequency (%) 
2.0610.1%
 
2.510.1%
 
2.610.1%
 
2.6720.2%
 
2.6910.1%
 
ValueCountFrequency (%) 
4.5222.5%
 
4.4510.1%
 
4.4320.2%
 
4.4240.5%
 
4.4150.6%
 

8학기성적
Real number (ℝ≥0)

MISSING

Distinct115
Distinct (%)30.5%
Missing487
Missing (%)56.4%
Infinite0
Infinite (%)0.0%
Mean3.859973475
Minimum1.43
Maximum4.5
Zeros0
Zeros (%)0.0%
Memory size6.8 KiB

Quantile statistics

Minimum1.43
5-th percentile3.096
Q13.58
median3.93
Q34.21
95-th percentile4.5
Maximum4.5
Range3.07
Interquartile range (IQR)0.63

Descriptive statistics

Standard deviation0.4742398726
Coefficient of variation (CV)0.1228609149
Kurtosis2.792619932
Mean3.859973475
Median Absolute Deviation (MAD)0.32
Skewness-1.161972532
Sum1455.21
Variance0.2249034567
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
4.5333.8%
 
4273.1%
 
4.33161.9%
 
4.25131.5%
 
3.75101.2%
 
3.5101.2%
 
3.83101.2%
 
4.17101.2%
 
4.1391.0%
 
3.8891.0%
 
Other values (105)23026.6%
 
(Missing)48756.4%
 
ValueCountFrequency (%) 
1.4310.1%
 
1.7510.1%
 
210.1%
 
2.0410.1%
 
2.510.1%
 
ValueCountFrequency (%) 
4.5333.8%
 
4.4610.1%
 
4.4310.1%
 
4.4210.1%
 
4.4110.1%
 

9학기성적
Real number (ℝ≥0)

MISSING

Distinct56
Distinct (%)61.5%
Missing773
Missing (%)89.5%
Infinite0
Infinite (%)0.0%
Mean3.683846154
Minimum0.75
Maximum4.5
Zeros0
Zeros (%)0.0%
Memory size6.8 KiB

Quantile statistics

Minimum0.75
5-th percentile2.53
Q13.31
median3.83
Q34.165
95-th percentile4.5
Maximum4.5
Range3.75
Interquartile range (IQR)0.855

Descriptive statistics

Standard deviation0.6689158214
Coefficient of variation (CV)0.1815808243
Kurtosis3.189977794
Mean3.683846154
Median Absolute Deviation (MAD)0.37
Skewness-1.364398171
Sum335.23
Variance0.4474483761
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
4.5101.2%
 
4101.2%
 
3.580.9%
 
3.230.3%
 
2.520.2%
 
4.3320.2%
 
3.120.2%
 
3.8820.2%
 
3.6320.2%
 
3.7520.2%
 
Other values (46)485.6%
 
(Missing)77389.5%
 
ValueCountFrequency (%) 
0.7510.1%
 
1.910.1%
 
2.2510.1%
 
2.520.2%
 
2.5610.1%
 
ValueCountFrequency (%) 
4.5101.2%
 
4.4210.1%
 
4.4110.1%
 
4.3810.1%
 
4.3320.2%
 

10학기성적
Real number (ℝ≥0)

MISSING

Distinct18
Distinct (%)75.0%
Missing840
Missing (%)97.2%
Infinite0
Infinite (%)0.0%
Mean3.585
Minimum1.5
Maximum4.5
Zeros0
Zeros (%)0.0%
Memory size6.8 KiB

Quantile statistics

Minimum1.5
5-th percentile2.56
Q13.1225
median3.73
Q34.0325
95-th percentile4.482
Maximum4.5
Range3
Interquartile range (IQR)0.91

Descriptive statistics

Standard deviation0.7126008701
Coefficient of variation (CV)0.1987729066
Kurtosis1.649879269
Mean3.585
Median Absolute Deviation (MAD)0.46
Skewness-1.08861886
Sum86.04
Variance0.5078
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=18)
ValueCountFrequency (%) 
450.6%
 
320.2%
 
4.520.2%
 
1.510.1%
 
3.7510.1%
 
4.2510.1%
 
3.2510.1%
 
2.510.1%
 
2.910.1%
 
3.1410.1%
 
Other values (8)80.9%
 
(Missing)84097.2%
 
ValueCountFrequency (%) 
1.510.1%
 
2.510.1%
 
2.910.1%
 
320.2%
 
3.0710.1%
 
ValueCountFrequency (%) 
4.520.2%
 
4.3810.1%
 
4.2510.1%
 
4.1710.1%
 
4.1310.1%
 

11학기성적
Categorical

HIGH CORRELATION
MISSING
UNIFORM

Distinct4
Distinct (%)100.0%
Missing860
Missing (%)99.5%
Memory size6.8 KiB
3.57
1
2.5
4.5
ValueCountFrequency (%) 
3.5710.1%
 
110.1%
 
2.510.1%
 
4.510.1%
 
(Missing)86099.5%
 
Frequencies of value counts

Unique

Unique: 4
Unique (%): 100.0%
Histogram of lengths of the category

Length

Max length4
Median length3
Mean length3.001157407
Min length3

12학기성적
Categorical

MISSING

Distinct1
Distinct (%)100.0%
Missing863
Missing (%)99.9%
Memory size6.8 KiB
2
ValueCountFrequency (%) 
210.1%
 
(Missing)86399.9%
 
Frequencies of value counts

Unique

Unique: 1
Unique (%): 100.0%
Histogram of lengths of the category

Length

Max length3
Median length3
Mean length3
Min length3

학부평균성적
Real number (ℝ≥0)

Distinct696
Distinct (%)80.6%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.691179616
Minimum2.09
Maximum4.5
Zeros0
Zeros (%)0.0%
Memory size6.8 KiB

Quantile statistics

Minimum2.09
5-th percentile3.077472222
Q13.468229167
median3.717638889
Q33.946770833
95-th percentile4.234625
Maximum4.5
Range2.41
Interquartile range (IQR)0.4785416667

Descriptive statistics

Standard deviation0.3666616102
Coefficient of variation (CV)0.09933453485
Kurtosis0.5112842475
Mean3.691179616
Median Absolute Deviation (MAD)0.2385714286
Skewness-0.5541871623
Sum3189.179188
Variance0.1344407364
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
3.7470.8%
 
3.9350.6%
 
3.6850.6%
 
4.152540.5%
 
3.4340.5%
 
4.540.5%
 
3.2940.5%
 
3.862540.5%
 
3.56540.5%
 
3.772540.5%
 
Other values (686)81994.8%
 
ValueCountFrequency (%) 
2.0910.1%
 
2.3710.1%
 
2.520.2%
 
2.53777777810.1%
 
2.56810.1%
 
ValueCountFrequency (%) 
4.540.5%
 
4.48833333310.1%
 
4.38833333310.1%
 
4.387510.1%
 
4.3812510.1%
 

직전2학기평균
Real number (ℝ≥0)

HIGH CORRELATION

Distinct283
Distinct (%)32.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.849450231
Minimum1.805
Maximum4.5
Zeros0
Zeros (%)0.0%
Memory size6.8 KiB

Quantile statistics

Minimum1.805
5-th percentile3.071
Q13.625
median3.925
Q34.16125
95-th percentile4.375
Maximum4.5
Range2.695
Interquartile range (IQR)0.53625

Descriptive statistics

Standard deviation0.419771656
Coefficient of variation (CV)0.1090471706
Kurtosis2.233665331
Mean3.849450231
Median Absolute Deviation (MAD)0.265
Skewness-1.189477687
Sum3325.925
Variance0.1762082432
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
4.25161.9%
 
4.5141.6%
 
4131.5%
 
3.92111.3%
 
3.94591.0%
 
3.85580.9%
 
3.880.9%
 
3.8780.9%
 
4.1880.9%
 
3.9880.9%
 
Other values (273)76188.1%
 
ValueCountFrequency (%) 
1.80510.1%
 
1.9510.1%
 
2.0310.1%
 
2.0910.1%
 
2.13510.1%
 
ValueCountFrequency (%) 
4.5141.6%
 
4.46510.1%
 
4.4630.3%
 
4.4520.2%
 
4.4420.2%
 

성적오름추세
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct332
Distinct (%)38.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.2944675926
Minimum-2.5
Maximum2.77
Zeros26
Zeros (%)3.0%
Memory size6.8 KiB

Quantile statistics

Minimum-2.5
5-th percentile-0.62
Q1-0.0325
median0.3
Q30.65
95-th percentile1.21
Maximum2.77
Range5.27
Interquartile range (IQR)0.6825

Descriptive statistics

Standard deviation0.5773668152
Coefficient of variation (CV)1.960714285
Kurtosis2.111138944
Mean0.2944675926
Median Absolute Deviation (MAD)0.34
Skewness-0.4267372996
Sum254.42
Variance0.3333524393
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
0263.0%
 
0.33151.7%
 
0.18111.3%
 
0.591.0%
 
0.3891.0%
 
0.6491.0%
 
0.5991.0%
 
-0.0280.9%
 
0.4280.9%
 
0.2580.9%
 
Other values (322)75287.0%
 
ValueCountFrequency (%) 
-2.510.1%
 
-2.4210.1%
 
-2.310.1%
 
-1.8310.1%
 
-1.7910.1%
 
ValueCountFrequency (%) 
2.7710.1%
 
1.8310.1%
 
1.7810.1%
 
1.7510.1%
 
1.7210.1%
 

성적기울기
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct361
Distinct (%)41.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.05410648148
Minimum-1.25
Maximum0.91
Zeros25
Zeros (%)2.9%
Memory size6.8 KiB

Quantile statistics

Minimum-1.25
5-th percentile-0.15525
Q10
median0.065
Q30.129
95-th percentile0.23485
Maximum0.91
Range2.16
Interquartile range (IQR)0.129

Descriptive statistics

Standard deviation0.1639914017
Coefficient of variation (CV)3.030901238
Kurtosis15.43945971
Mean0.05410648148
Median Absolute Deviation (MAD)0.065
Skewness-2.099298695
Sum46.748
Variance0.02689317984
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
0252.9%
 
0.07480.9%
 
0.08580.9%
 
0.03580.9%
 
0.06770.8%
 
0.05270.8%
 
0.05770.8%
 
0.0160.7%
 
0.0960.7%
 
0.06960.7%
 
Other values (351)77689.8%
 
ValueCountFrequency (%) 
-1.2510.1%
 
-0.9910.1%
 
-0.9810.1%
 
-0.89510.1%
 
-0.8910.1%
 
ValueCountFrequency (%) 
0.9110.1%
 
0.8710.1%
 
0.6210.1%
 
0.6110.1%
 
0.5320.2%
 

성적기울기_직전4학기
Real number (ℝ)

ZEROS

Distinct467
Distinct (%)54.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.05740625
Minimum-1.25
Maximum0.91
Zeros23
Zeros (%)2.7%
Memory size6.8 KiB

Quantile statistics

Minimum-1.25
5-th percentile-0.2667
Q1-0.03325
median0.069
Q30.17725
95-th percentile0.3497
Maximum0.91
Range2.16
Interquartile range (IQR)0.2105

Descriptive statistics

Standard deviation0.2101915266
Coefficient of variation (CV)3.661474607
Kurtosis5.397603983
Mean0.05740625
Median Absolute Deviation (MAD)0.104
Skewness-1.083158803
Sum49.599
Variance0.04418047788
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
0232.7%
 
0.1880.9%
 
0.0370.8%
 
0.0560.7%
 
-0.03260.7%
 
0.10960.7%
 
0.03460.7%
 
0.1950.6%
 
0.05350.6%
 
0.06150.6%
 
Other values (457)78791.1%
 
ValueCountFrequency (%) 
-1.2510.1%
 
-0.9910.1%
 
-0.9810.1%
 
-0.89510.1%
 
-0.8910.1%
 
ValueCountFrequency (%) 
0.9110.1%
 
0.8710.1%
 
0.72710.1%
 
0.63710.1%
 
0.6210.1%
 

직전2학기증가율
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct380
Distinct (%)44.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean-0.002222222222
Minimum-0.667
Maximum0.839
Zeros51
Zeros (%)5.9%
Memory size6.8 KiB

Quantile statistics

Minimum-0.667
5-th percentile-0.2027
Q1-0.067
median0
Q30.059
95-th percentile0.19385
Maximum0.839
Range1.506
Interquartile range (IQR)0.126

Descriptive statistics

Standard deviation0.1342417672
Coefficient of variation (CV)-60.40879525
Kurtosis5.837836173
Mean-0.002222222222
Median Absolute Deviation (MAD)0.0635
Skewness0.1170918501
Sum-1.92
Variance0.01802085207
Monotonicity: Decreasing
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
0515.9%
 
-0.03491.0%
 
-0.00580.9%
 
0.01970.8%
 
-0.05970.8%
 
-0.01470.8%
 
0.00370.8%
 
-0.0170.8%
 
0.04770.8%
 
0.05570.8%
 
Other values (370)74786.5%
 
ValueCountFrequency (%) 
-0.66710.1%
 
-0.65510.1%
 
-0.61710.1%
 
-0.59610.1%
 
-0.51210.1%
 
ValueCountFrequency (%) 
0.83910.1%
 
0.68510.1%
 
0.62510.1%
 
0.57910.1%
 
0.46710.1%
 

대학원_계열
Categorical

HIGH CORRELATION

Distinct4
Distinct (%)0.5%
Missing0
Missing (%)0.0%
Memory size6.8 KiB
공학계열
691 
예/체능계열
87 
자연과학계열
80 
인문사회계열
 
6
ValueCountFrequency (%) 
공학계열69180.0%
 
예/체능계열8710.1%
 
자연과학계열809.3%
 
인문사회계열60.7%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length6
Median length4
Mean length4.400462963
Min length4

대학원_학과
Categorical

HIGH CORRELATION

Distinct35
Distinct (%)4.1%
Missing0
Missing (%)0.0%
Memory size6.8 KiB
기계설계로봇공학과
111 
정밀화학과
65 
전기정보공학과
61 
건설시스템공학과
56 
신소재공학과
53 
Other values (30)
518 
ValueCountFrequency (%) 
기계설계로봇공학과11112.8%
 
정밀화학과657.5%
 
전기정보공학과617.1%
 
건설시스템공학과566.5%
 
신소재공학과536.1%
 
기계공학과485.6%
 
건축과475.4%
 
환경공학과435.0%
 
자동차공학과394.5%
 
기계디자인금형공학과354.1%
 
Other values (25)30635.4%
 
Frequencies of value counts

Unique

Unique: 5
Unique (%): 0.6%
Histogram of lengths of the category

Length

Max length15
Median length6
Mean length6.722222222
Min length3

대학원_졸업연도
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct7
Distinct (%)1.2%
Missing276
Missing (%)31.9%
Infinite0
Infinite (%)0.0%
Mean2017.435374
Minimum2014
Maximum2020
Zeros0
Zeros (%)0.0%
Memory size6.8 KiB

Quantile statistics

Minimum2014
5-th percentile2014
Q12016
median2018
Q32019
95-th percentile2020
Maximum2020
Range6
Interquartile range (IQR)3

Descriptive statistics

Standard deviation1.892576828
Coefficient of variation (CV)0.0009381102622
Kurtosis-1.067944556
Mean2017.435374
Median Absolute Deviation (MAD)1
Skewness-0.2906693867
Sum1186252
Variance3.581847049
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
ValueCountFrequency (%) 
201911913.8%
 
20209511.0%
 
20179110.5%
 
20189110.5%
 
2016778.9%
 
2015647.4%
 
2014515.9%
 
(Missing)27631.9%
 
ValueCountFrequency (%) 
2014515.9%
 
2015647.4%
 
2016778.9%
 
20179110.5%
 
20189110.5%
 
ValueCountFrequency (%) 
20209511.0%
 
201911913.8%
 
20189110.5%
 
20179110.5%
 
2016778.9%
 

대학원_입학연도
Real number (ℝ≥0)

HIGH CORRELATION

Distinct9
Distinct (%)1.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2016.609954
Minimum2012
Maximum2020
Zeros0
Zeros (%)0.0%
Memory size6.8 KiB

Quantile statistics

Minimum2012
5-th percentile2012
Q12014
median2017
Q32019
95-th percentile2020
Maximum2020
Range8
Interquartile range (IQR)5

Descriptive statistics

Standard deviation2.609270041
Coefficient of variation (CV)0.0012938893
Kurtosis-1.212957948
Mean2016.609954
Median Absolute Deviation (MAD)2
Skewness-0.2038855726
Sum1742351
Variance6.808290148
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
ValueCountFrequency (%) 
202017019.7%
 
201910912.6%
 
201510111.7%
 
201710011.6%
 
20148910.3%
 
20168610.0%
 
2018789.0%
 
2013677.8%
 
2012647.4%
 
ValueCountFrequency (%) 
2012647.4%
 
2013677.8%
 
20148910.3%
 
201510111.7%
 
20168610.0%
 
ValueCountFrequency (%) 
202017019.7%
 
201910912.6%
 
2018789.0%
 
201710011.6%
 
20168610.0%
 
Distinct15
Distinct (%)1.7%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean5.642361111
Minimum2
Maximum17
Zeros0
Zeros (%)0.0%
Memory size6.8 KiB

Quantile statistics

Minimum2
5-th percentile2
Q14
median6
Q37
95-th percentile8
Maximum17
Range15
Interquartile range (IQR)3

Descriptive statistics

Standard deviation1.798186799
Coefficient of variation (CV)0.3186940295
Kurtosis3.496780439
Mean5.642361111
Median Absolute Deviation (MAD)1
Skewness0.6448966491
Sum4875
Variance3.233475763
MonotonicityNot monotonic
Histogram with fixed size bins (bins=15)

Value             Count  Frequency (%)
6                 281    32.5%
4                 153    17.7%
7                 130    15.0%
5                 121    14.0%
8                 63     7.3%
2                 46     5.3%
3                 34     3.9%
9                 20     2.3%
10                9      1.0%
11                2      0.2%
Other values (5)  5      0.6%

Value  Count  Frequency (%)
2      46     5.3%
3      34     3.9%
4      153    17.7%
5      121    14.0%
6      281    32.5%

Value  Count  Frequency (%)
17     1      0.1%
16     1      0.1%
14     1      0.1%
13     1      0.1%
12     1      0.1%
 

학부와 대학원 전공의 계열 일치 여부
Boolean

Distinct2
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size6.8 KiB

Value  Count  Frequency (%)
1      856    99.1%
0      8      0.9%
 

km_5
Real number (ℝ≥0)

ZEROS

Distinct5
Distinct (%)0.6%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1.822916667
Minimum0
Maximum4
Zeros172
Zeros (%)19.9%
Memory size6.8 KiB

Quantile statistics

Minimum0
5-th percentile0
Q11
median2
Q33
95-th percentile4
Maximum4
Range4
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.318337663
Coefficient of variation (CV)0.7232023753
Kurtosis-1.291460599
Mean1.822916667
Median Absolute Deviation (MAD)1
Skewness0.0702669564
Sum1575
Variance1.738014195
MonotonicityNot monotonic
Histogram with fixed size bins (bins=5)

Value  Count  Frequency (%)
3      254    29.4%
1      237    27.4%
0      172    19.9%
2      114    13.2%
4      87     10.1%

Value  Count  Frequency (%)
0      172    19.9%
1      237    27.4%
2      114    13.2%
3      254    29.4%
4      87     10.1%

Value  Count  Frequency (%)
4      87     10.1%
3      254    29.4%
2      114    13.2%
1      237    27.4%
0      172    19.9%
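
The summary statistics reported for km_5 can be reproduced from its value counts alone. The following is a minimal sketch in plain Python, not the report generator's own code. It assumes the variance is the sample variance (divisor n - 1), which is the choice that matches the reported figures; the function name `stats_from_counts` is illustrative.

```python
import math

def stats_from_counts(counts):
    """Return (n, total, mean, variance, std, cv) for a {value: count} table.

    Assumption: variance uses the n - 1 divisor (sample variance).
    """
    n = sum(counts.values())
    total = sum(value * count for value, count in counts.items())
    mean = total / n
    squared_dev = sum(count * (value - mean) ** 2 for value, count in counts.items())
    variance = squared_dev / (n - 1)
    std = math.sqrt(variance)
    cv = std / mean  # coefficient of variation
    return n, total, mean, variance, std, cv

# Value counts for km_5, copied from the frequency table above.
km_5_counts = {0: 172, 1: 237, 2: 114, 3: 254, 4: 87}
n, total, mean, variance, std, cv = stats_from_counts(km_5_counts)
print(n, total, mean, variance, std, cv)
```

The count, sum, mean, variance, standard deviation and coefficient of variation computed this way agree with the values listed for km_5 to the precision shown in the report.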
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for an exactly linear relationship the steepness of the line does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
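
As an illustration of the calculation described above, here is a minimal sketch in plain Python. The function name `pearson_r` and the sample data are made up for the example; the sketch assumes neither variable is constant.

```python
import math

def pearson_r(x, y):
    """Pearson's r: covariance of x and y over the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # The 1/(n - 1) factors of the covariance and the variances cancel out.
    return cov / math.sqrt(var_x * var_y)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
gentle = pearson_r(x, [2 * v + 1 for v in x])      # exactly linear, positive slope
steep = pearson_r(x, [100 * v - 7 for v in x])     # much steeper line, same r
falling = pearson_r(x, [-0.5 * v for v in x])      # negative slope flips the sign
print(gentle, steep, falling)
```

The first two calls differ only in the location and scale of y, which is why they give the same r.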

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
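
The rank-based calculation can be sketched in plain Python as follows. The helper names are illustrative, and the sketch assumes tied values receive the average of the ranks they span, which is a common convention but not something the text above specifies.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def pearson_r(x, y):
    """Covariance of x and y over the product of their standard deviations."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear
print(spearman_rho(x, y), pearson_r(x, y))
```

For this cubic relationship ρ reaches 1 while r stays below 1, because the ranks of x and y agree exactly even though the raw values do not lie on a line.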

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
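
The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch in plain Python follows; the function name is illustrative, and tied pairs are assumed to count toward the total only. Some libraries use a tie-adjusted variant (tau-b) by default, so results can differ when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1   # both variables order the pair the same way
        elif product < 0:
            discordant += 1   # the variables order the pair in opposite ways
        # product == 0 is a tie in x or y: it only counts toward the total
    return (concordant - discordant) / len(pairs)

same_order = kendall_tau_a([1, 2, 3, 4], [1, 2, 3, 4])
reversed_order = kendall_tau_a([1, 2, 3, 4], [4, 3, 2, 1])
one_swap = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(same_order, reversed_order, one_swap)
```

In the last call, five of the six pairs are concordant and one is discordant, so τ = (5 - 1) / 6.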

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
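
A sketch of the bias-corrected estimator in plain Python is given below. It follows the correction published by Bergsma (2013) and is not necessarily identical to the code that produced this report. The function name is illustrative, and the sketch assumes a contingency table of counts with at least two rows and two columns and no empty row or column.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for an r x k contingency table of counts.

    Assumption: every row total and column total is greater than zero.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction of phi^2 and of the table dimensions.
    phi2_corr = max(0.0, phi2 - (r - 1) * (k - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(r_corr - 1, k_corr - 1))

perfect = cramers_v_corrected([[50, 0], [0, 50]])        # categories determine each other
independent = cramers_v_corrected([[25, 25], [25, 25]])  # no association at all
print(perfect, independent)
```

The two toy tables mark the ends of the scale: the first gives a value of 1 (up to floating-point rounding) and the second gives 0.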

Missing values

Sample

First rows

Unnamed: 0학생통계번호성별학부_단과대학학부_학과학부_주야구분학부_편입여부학부_입학구분학부_입학년도학부_전형구분학부_정원내외구분현장실습이수여부복수전공여부부전공여부교환학생여부토익점수수도권_거주여부장학금액학습역량 여부진로,심리 여부취업,진로 여부창업 여부비교과 여부휴학_기타휴학_군대1학기성적2학기성적3학기성적4학기성적5학기성적6학기성적7학기성적8학기성적9학기성적10학기성적11학기성적12학기성적학부평균성적직전2학기평균성적오름추세성적기울기성적기울기_직전4학기직전2학기증가율대학원_계열대학원_학과대학원_졸업연도대학원_입학연도학부입학 후 대학원입학까지 걸린 시간학부와 대학원 전공의 계열 일치 여부km_5
008567334639여자조형대학디자인학과주간0신입학2010수시2-특기자전형정원내0010NaN1000000103.093.132.693.393.463.002.751.432.63NaNNaNNaN2.8411112.030-0.46-0.120-0.2430.839예/체능계열시각디자인학과2020.02018810
118172064647남자공과대학신소재공학과야간0신입학2009수시2-산업체근무경력자정원내0010NaN1000000103.864.003.504.113.583.002.674.50NaNNaNNaNNaN3.6525003.5850.64-0.0500.2430.685공학계열기계공학과NaN20191011
228028864651여자공과대학안전공학과주간0신입학2016수시전형정원내0000760.01011101002.883.003.152.853.043.213.032.774.50NaNNaNNaN3.1588893.6351.620.0990.3610.625공학계열안전공학과NaN2020414
338177054641남자조형대학도예학과주간0신입학2013일반전형정원내0000NaN1000000112.352.282.582.373.122.503.004.121.903.00NaNNaN2.7220002.4500.650.074-0.2220.579예/체능계열도예학과NaN2020713
448156944635남자기술경영융합대학산업정보시스템전공(학부)주간0신입학2012일반전형정원내0100895.01000000113.342.973.633.363.433.193.603.272.253.30NaNNaN3.2340002.775-0.04-0.041-0.1920.467공학계열안전공학과NaN2020813
558207454648남자에너지바이오대학식품공학과주간0신입학2013일반전형정원내0000830.01010101113.062.753.032.553.003.253.253.633.643.074.5NaN3.2481823.7851.440.1160.2040.466공학계열식품공학과NaN2020712
668447354643남자기술경영융합대학MSDE 학과주간0신입학2008일반전형정원내0000NaN0130086000000004.003.504.003.332.673.83NaNNaNNaNNaNNaNNaN3.5550003.250-0.17-0.115-0.1170.434공학계열기계설계로봇공학과2017.02014611
778107584651여자에너지바이오대학화공생명공학과주간0신입학2013차세대지도자특별전형정원내1000515.01196210000000003.002.873.052.923.363.073.253.693.204.50NaNNaN3.2910003.8501.500.1190.3260.406공학계열화학공학과2020.02018510
888117124641남자정보통신대학컴퓨터공학과주간0신입학2008일반전형정원내0100NaN19197000000003.573.554.084.063.443.044.25NaNNaNNaNNaNNaN3.7128573.6450.680.0140.0170.398공학계열SW분석·설계학과2016.02014611
998037674645남자공과대학기계·자동차공학과주간0신입학2009일반전형정원내0000NaN1290200000000114.113.543.653.383.003.424.50NaNNaNNaNNaNNaN3.6571433.9600.390.0100.3780.316공학계열기계공학과NaN20191013

Last rows

Unnamed: 0학생통계번호성별학부_단과대학학부_학과학부_주야구분학부_편입여부학부_입학구분학부_입학년도학부_전형구분학부_정원내외구분현장실습이수여부복수전공여부부전공여부교환학생여부토익점수수도권_거주여부장학금액학습역량 여부진로,심리 여부취업,진로 여부창업 여부비교과 여부휴학_기타휴학_군대1학기성적2학기성적3학기성적4학기성적5학기성적6학기성적7학기성적8학기성적9학기성적10학기성적11학기성적12학기성적학부평균성적직전2학기평균성적오름추세성적기울기성적기울기_직전4학기직전2학기증가율대학원_계열대학원_학과대학원_졸업연도대학원_입학연도학부입학 후 대학원입학까지 걸린 시간학부와 대학원 전공의 계열 일치 여부km_5
8548548217164646여자조형대학조형예술학과주간0신입학2008일반전형정원내0000NaN1134688000000103.903.892.50NaNNaNNaNNaNNaNNaNNaNNaNNaN3.4300003.195-1.40-0.700-0.700-0.357예/체능계열조형예술과2020.02016810
8558558187324640남자조형대학금속공예디자인학과주간0신입학2012일반전형정원내0000NaN1596495001011113.553.653.924.213.603.783.924.332.50NaNNaNNaN3.7177783.415-1.05-0.043-0.343-0.423예/체능계열금속공예디자인학과NaN2020812
8568568437724642남자공과대학건설시스템공학과주간0신입학2012수시전형정원내0100815.01277020000000113.003.723.252.503.483.283.502.00NaNNaNNaNNaN3.0912502.750-1.00-0.084-0.422-0.429공학계열건설시스템공학과NaN2020813
8578578207134643남자에너지바이오대학식품공학과주간0신입학2013수시전형정원내1000410.01010111112.462.442.502.393.053.242.953.234.502.5NaNNaN2.9260003.5000.040.123-0.008-0.444공학계열식품공학과NaN2020712
8588588838404651남자공과대학기계시스템디자인공학과주간0신입학2012일반전형정원내1000780.01559547000000003.874.213.923.383.923.503.932.04NaNNaNNaNNaN3.5962502.985-1.83-0.178-0.521-0.481공학계열기계설계로봇공학과2018.02016411
8598598447254636남자공과대학기계시스템디자인공학과주간0신입학2005일반전형정원내0000NaN154708000000003.794.102.00NaNNaNNaNNaNNaNNaNNaNNaNNaN3.2966673.050-1.79-0.895-0.895-0.512공학계열기계설계로봇공학과2015.02013811
8608608157284643남자기술경영융합대학산업정보시스템전공(학부)주간0신입학2008일반전형정원내1000590.01000000113.063.003.523.923.711.50NaNNaNNaNNaNNaNNaN3.1183332.605-1.56-0.151-0.627-0.596공학계열SW분석·설계학과2019.02017913
8618618066964632남자공과대학토목공학전공주간0신입학2005특별전형정원내0000NaN115148000000003.502.611.00NaNNaNNaNNaNNaNNaNNaNNaNNaN2.3700001.805-2.50-1.250-1.250-0.617공학계열건설시스템공학과2015.02012711
8628628437134637남자공과대학건설시스템공학과주간0신입학2012일반전형정원내0000NaN0150859001001113.423.984.003.873.803.252.803.830.752.91.0NaN3.0545451.950-2.42-0.248-0.634-0.655공학계열건설시스템공학과NaN2020812
8638638837344644남자공과대학기계시스템디자인공학과주간0신입학2012일반전형정원내0000725.01011101113.262.712.503.002.863.033.583.084.501.5NaNNaN3.0020003.000-1.760.009-0.482-0.667공학계열기계설계로봇공학과NaN2020812